
microseq (version 1.0)

readFasta and writeFasta: Read and write FASTA files

Description

Reads and writes biological sequences (DNA, RNA, protein) in the FASTA format.

Usage

readFasta(in.file)
writeFasta(fdta, out.file, width = 80)

Arguments

in.file
URL, path, or name of the FASTA file to read.
fdta
A Fasta object, see ‘Details’ below.
out.file
Name of the FASTA file to create.
width
Number of sequence characters per line.

Value

readFasta returns a Fasta object with the contents of the FASTA file. This is an extension of a data.frame and contains two columns of text. The first, named Header, contains the header lines, and the second, named Sequence, contains the sequences.

writeFasta produces a FASTA file.

Details

These functions handle input/output of sequences in the commonly used FASTA format. For every sequence it is presumed there is one Header-line starting with a ‘>’.
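To make the format convention concrete, here is a minimal base-R sketch of how header lines starting with ‘>’ delimit records, and how a width setting wraps sequence lines on output. It is not the microseq implementation; the helper names parse_fasta_lines and format_fasta_lines, and the example data, are hypothetical and used only for illustration.

```r
# Illustrative base-R sketch of the FASTA convention; NOT how microseq does it.
# parse_fasta_lines() and format_fasta_lines() are hypothetical helper names.

parse_fasta_lines <- function(lines) {
  lines <- lines[nzchar(lines)]                # drop empty lines
  is.header <- startsWith(lines, ">")          # one header line per sequence
  record.id <- cumsum(is.header)               # record index of every line
  headers <- sub("^>", "", lines[is.header])   # strip the leading '>'
  # Group the sequence lines by record and join each group into one string
  seq.parts <- split(lines[!is.header],
                     factor(record.id[!is.header], levels = seq_along(headers)))
  sequences <- vapply(seq.parts, paste, character(1),
                      collapse = "", USE.NAMES = FALSE)
  data.frame(Header = headers, Sequence = sequences, stringsAsFactors = FALSE)
}

format_fasta_lines <- function(fdta, width = 80) {
  out <- character(0)
  for (i in seq_len(nrow(fdta))) {
    s <- fdta$Sequence[i]
    # Start positions of each chunk of at most 'width' characters
    starts <- seq(1, max(nchar(s), 1), by = width)
    chunks <- substring(s, starts, starts + width - 1)
    out <- c(out, paste0(">", fdta$Header[i]), chunks)
  }
  out
}

# Made-up example content
lines <- c(">seq1 first example", "ACGTAC", "GTAC",
           ">seq2 second example", "MKV")
fdta <- parse_fasta_lines(lines)
print(fdta)

wrapped <- format_fasta_lines(fdta, width = 4)
print(wrapped)
```

The sketch returns a plain data.frame with the same two column names, not an object of class Fasta; for real files, readFasta and writeFasta should be used.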

The sequences are stored in a Fasta object. This is an extension of a data.frame containing two text-columns named Header and Sequence. If other columns are present, these will be ignored by writeFasta.

The Fasta object can be treated as a data.frame, and in addition the methods plot.Fasta and summary.Fasta are defined for it. The data.frame property makes it straightforward to manipulate all headers or all sequences, to extract or delete entries (rows), or to merge several data sets using rbind.
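The manipulations mentioned above can be sketched with ordinary data.frame operations. The example below uses plain data.frames with made-up data as stand-ins for Fasta objects (an assumption, so that it runs without microseq installed); the same indexing and rbind calls are expected to apply to an object returned by readFasta.

```r
# Plain data.frames standing in for Fasta objects; all data are made up.
set1 <- data.frame(Header = c("seqA sample 1", "seqB sample 1"),
                   Sequence = c("ACGTTGCA", "GGCCTTAA"),
                   stringsAsFactors = FALSE)
set2 <- data.frame(Header = "seqC sample 2",
                   Sequence = "ATGAAATAG",
                   stringsAsFactors = FALSE)

# Merge several data sets
merged <- rbind(set1, set2)

# Manipulate all headers at once: keep only the first token
merged$Header <- sub(" .*$", "", merged$Header)

# Extract entries (rows): sequences longer than 8 characters
long <- merged[nchar(merged$Sequence) > 8, ]

# Delete an entry (row): drop the first sequence
trimmed <- merged[-1, ]

print(merged)
print(long)
print(trimmed)
```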

readFasta makes use of readBStringSet and writeFasta makes use of writeXStringSet in the Biostrings package.

See Also

plot.Fasta, summary.Fasta, readFastq.

Examples

ex.file <- file.path(path.package("microseq"), "extdata", "small.fasta")
fdta <- readFasta(ex.file)
summary(fdta)
plot(fdta)
